The SPARTACUS-Database: a Spanish Sentence Database for Offline Handwriting Recognition
Abstract
In this paper we describe a database of offline handwritten Spanish sentences drawn from four different subtasks. The database includes 1 500 forms produced by the same number of writers. In total, around 100 000 word instances occur in the collection, drawn from a vocabulary of around 3 300 words. The database is intended for offline handwriting recognition tasks, and it is expected to be especially useful for recognition systems that can take advantage of language models for semantically restricted tasks. The database also includes a few image-processing procedures for extracting handwritten text images from the forms and segmenting the images into lines and words.
Related resources
Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model
Automatic recognition of printed texts and manuscripts facilitates the entry of data into the computer and its digitization, and is a considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...
Rejection Strategies in Handwriting Recognition Systems
This master's thesis investigates multiple rejection strategies for offline handwritten sentence recognition. The rejection strategies are implemented as a post-processing step of a Hidden Markov Model based text recognition system, and are based on confidence measures derived from a list of additional candidate sentences produced by the recogniser. Four different reject models are presented and ...
Isolated Persian/Arabic handwriting characters: Derivative projection profile features, implemented on GPUs
For many years, researchers have studied high-accuracy methods for recognizing handwriting and have achieved many significant improvements. However, an issue that has rarely been studied is the speed of these methods. Given computer hardware limitations, these methods need to run at high speed. One of the methods to increase the processing speed is to use the computer pa...
A Full English Sentence Database for Off-line Handwriting Recognition
In this paper we present a new database for off-line handwriting recognition, together with a few preprocessing and text segmentation procedures. The database is based on the Lancaster-Oslo/Bergen (LOB) corpus. This corpus is a collection of texts that were used to generate forms, which were subsequently filled out by people in their own handwriting. Up to now (December 1998) the database include...
Sentence Recognition through Hybrid Neuro-Markovian Modeling
This paper focuses on designing a handwriting recognition system that deals with on-line signals, i.e. temporal handwriting signals captured through an electronic pen or a digitizing tablet. We present here some new results concerning a hybrid on-line handwriting recognition system based on Hidden Markov Models (HMMs) and Neural Networks (NNs), which has already been presented in several contributi...
Publication date: 2004